Automatic Acquisition of Possible Contexts for Low-Frequent Words
Author
Abstract
The present work is a PhD project that aims to overcome the data sparsity problem in the acquisition of lexical resources. In a corpus of any size, many words are infrequent, so they are observed co-occurring with only a small set of words, even though they could co-occur with many others. Our goal is to discover further possible co-occurring words for low-frequent words by relying on other co-occurrences observed in the corpus. Our approach aims to formulate a new similarity measure, based on word usage in language, to approve the transfer of co-occurring words from a frequent word to a low-frequent
Similar resources
Automatic Acquisition of Two-Level Morphological Rules
We describe and experimentally evaluate a complete method for the automatic acquisition of two-level rules for morphological analyzers/generators. The input to the system is sets of source-target word pairs, where the target is an inflected form of the source. There are two phases in the acquisition process: (1) segmentation of the target into morphemes and (2) determination of the optimal two-...
EBL2: An Approach To Automatic Lexical Acquisition
A method for automatic lexical acquisition is outlined. An existing lexicon that, in addition to ordinary lexical entries, contains prototypical entries for various non-exclusive paradigms of open-class words, is extended by inferring new lexical entries from texts containing unknown words. This is done by comparing the constraints placed on the unknown words by the natural language system's...
Automatic Acquisition of Synonyms Using the Web as a Corpus
We present an original algorithm for automatic acquisition of synonyms from text. The algorithm measures the semantic similarity between pairs of words by comparing their local contexts, extracted from the Web by a series of queries against the Google search engine. The results show an 11-point average precision of 63.16%.
Context Feature Selection for Distributional Similarity
Distributional similarity is a widely used concept to capture the semantic relatedness of words in various NLP tasks. However, accurate similarity calculation requires a large number of contexts, which leads to impractically high computational complexity. To alleviate the problem, we have investigated the effectiveness of automatic context selection by applying feature selection methods explore...
Semantic Content Acquisition and Representation (SCAR) 2007
Given a target word wi to be disambiguated, we define a class of local contexts for wi such that the sense of wi is univocally determined. We call such local contexts sense discriminative and represent them with sense discriminative (SD) patterns of lexico-syntactic features. We describe an algorithm for the automatic acquisition of minimal SD patterns based on training data in SemCor. We have ...
Publication date: 2011